A Supervised Statistical Data Quantization Method in Machine Learning
Abstract
Data quantization methods for continuous attributes play an important role in artificial intelligence, data mining, and machine learning because most classification methods require discrete attribute values. In this paper, we present a supervised statistical data quantization method. It defines a quantization criterion based on the chi-square statistic to decide accurately which adjacent intervals to merge. In addition, a heuristic quantization algorithm is proposed to achieve a satisfactory quantization result, with the aim of improving the performance of inductive learning algorithms. Empirical experiments on real UCI data sets show that the proposed algorithm generates a better quantization scheme than existing algorithms, one that improves the classification accuracy of the C4.5 decision tree.
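The abstract does not give the paper's criterion or heuristic in detail. As a rough illustration of the general idea of chi-square-based interval merging, the following is a minimal sketch of the classic bottom-up (ChiMerge-style) scheme, not the authors' algorithm. The function names `chi_square` and `chi_merge` and the `max_intervals` stopping rule are assumptions made for this example.

```python
from collections import Counter


def chi_square(a, b):
    """Pearson chi-square statistic for two adjacent intervals.

    a and b are Counters mapping class label -> count inside each interval.
    """
    classes = set(a) | set(b)
    total = sum(a.values()) + sum(b.values())
    stat = 0.0
    for row in (a, b):
        row_total = sum(row.values())
        for c in classes:
            # Expected count if class and interval were independent.
            expected = row_total * (a[c] + b[c]) / total
            if expected > 0:
                stat += (row[c] - expected) ** 2 / expected
    return stat


def chi_merge(values, labels, max_intervals):
    """Merge the most similar adjacent intervals until max_intervals remain.

    Hypothetical stopping rule: a fixed interval count (an assumption,
    not taken from the paper). Returns the sorted list of cut points.
    """
    # Start with one interval per distinct attribute value.
    counts = {}
    for v, y in zip(values, labels):
        counts.setdefault(v, Counter())[y] += 1
    intervals = [(v, counts[v]) for v in sorted(counts)]  # (lower bound, class counts)

    while len(intervals) > max_intervals:
        scores = [
            chi_square(intervals[i][1], intervals[i + 1][1])
            for i in range(len(intervals) - 1)
        ]
        i = scores.index(min(scores))  # lowest chi-square = most similar pair
        merged = (intervals[i][0], intervals[i][1] + intervals[i + 1][1])
        intervals[i:i + 2] = [merged]

    return [lower for lower, _ in intervals[1:]]


cuts = chi_merge([1, 2, 3, 10, 11, 12], ["a", "a", "a", "b", "b", "b"], 2)
print(cuts)
```

A low chi-square value between two adjacent intervals means their class distributions are similar, so merging them loses little class information. A supervised criterion of the kind the abstract describes would replace the fixed `max_intervals` rule with a statistical test or a refined measure.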
Similar resources
Border sensitive fuzzy vector quantization in semi-supervised learning
Abstract. We propose a semi-supervised fuzzy vector quantization method for the classification of incompletely labeled data. Since information contained within the structure of the data set should not be neglected, our method considers the whole data set during the learning process. In contrast to known methods, our approach uses neighborhood cooperativeness for stable prototype learning known...
Supervised Competition Using Joined Growing Neural Gas
Competitive learning is a well-known method for processing data. Various goals, such as classification or vector quantization, may be achieved using competitive learning. In this paper, we present a different insight into the principle of supervised competitive learning. An innovative approach to supervised self-organization is suggested. The method is based on different handling of input data labe...
Emotion Detection in Persian Text; A Machine Learning Model
This study aimed to develop a computational model for recognizing emotion in Persian text as a supervised machine learning problem. We considered the Plutchik emotion model as the supervised learning criterion and a Support Vector Machine (SVM) as the baseline classifier. We also used the NRC lexicon and contextual features as training data and components of the model. One hundred selected texts including pol...
Composite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
Statistical machine learning for data mining and collaborative multimedia retrieval
Thesis entitled: Statistical Machine Learning for Data Mining and Collaborative Multimedia Retrieval. Submitted by HOI, Chu Hong (Steven) for the degree of Doctor of Philosophy at The Chinese University of Hong Kong in September 2006. Statistical machine learning techniques have been widely applied in data mining and multimedia information retrieval. While traditional methods, such as supervis...
Journal title:
- Journal of Multimedia
Volume 8, Issue
Pages -
Publication date 2013